DETAILED ACTION 
Claim Rejections - 35 USC §112 

The following is a quotation of the second paragraph of 35 U.S.C. 112: 

The specification shall conclude with one or more claims particularly pointing out and distinctly 
claiming the subject matter which the applicant regards as his invention. 

1. Claim 9 is rejected under 35 U.S.C. 112, second paragraph, as being indefinite 
for failing to particularly point out and distinctly claim the subject matter which applicant 
regards as the invention. 

Claim 9 recites the limitation "the multiplications" in line 4. There is insufficient 
antecedent basis for this limitation in the claim. At best, base claim 2 states the matrix 
entries are combined by "a mathematical operation" but does not specify what that 
mathematical operation is. 

Claim Rejections - 35 USC § 102 

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 
A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

2. Claims 1, 4, 5, and 12 are rejected under 35 U.S.C. 102(b) as being anticipated 
by Besling (A Statistical Approach to Multilingual Phonetic Transcription). 

3. In regard to claim 1, Besling discloses a method of assignment of phonemes to a 
lexicon of words that uses a dynamic time warping algorithm (dynamic programming) to 
phonetically transcribe words by assigning phoneme sequences to grapheme 
sequences of the words (section 2). 

4. In regard to claim 4, Besling discloses that after the execution of the assignment 
of graphemes to phonemes for each word of the lexicon, the assignments are used to 
determine the position-dependent (probability of a grapheme g at a position i, g_i) relative 
frequency with which the following combinations occur: 

a) a phoneme produced by two or more graphemes (phoneme stretching) 

b) two or more phonemes produced by a grapheme (grapheme stretching) 

c) two or more graphemes assigned to a phoneme (phoneme stretching), 
and 

d) a grapheme assigned to two or more phonemes (grapheme stretching). 
See page 369, lines 14-18, Fig. 2, and section 3. 

5. In regard to claim 5, Besling discloses the assignment of graphemes to 
phonemes is corrected with the aid of the position dependent frequencies (by 
performing several iterations of alignment and re-estimation, page 369, lines 4-6). 

6. In regard to claim 12, Besling discloses a computer system (automatic system) 
that executes a program that uses a dynamic time warping algorithm (dynamic 
programming) to phonetically transcribe words by assigning phoneme sequences to 
grapheme sequences of the words (section 2). 

A computer system inherently includes a storage device for storing a computer 
program on a storage medium and a processing unit for loading the computer program 
from the storage device and executing the computer program. 

Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

7. Claims 2, 3, 6-11, and 13 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Besling in view of Sakoe et al. (Dynamic Programming Algorithm 
Optimization for Spoken Word Recognition). 

8. In regard to claims 2 and 13, Besling discloses a method of assigning phonemes 
to the graphemes producing them. The method is implemented in a program for 
controlling a computer (automatic system), which, inherently, must be stored on a 
computer readable medium. The method includes the following steps: 

Determining the relative frequency with which phonemes and graphemes are 
assigned to one another for each assignment of phoneme and graphemes (probability 
distribution for production of a phoneme by a grapheme, page 369, lines 3-9). 

Creating a two dimensional matrix, one index of which is given by the grapheme 
of the word and the second index of which is given by the phoneme of the word (Fig. 1). 

The relative frequencies belonging to the respective phoneme-grapheme pairs 
are used as entries in the matrix (distance penalties are assigned according to the 
relative frequencies for each grapheme-phoneme pair, page 369, lines 3-9). 
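
The frequency and matrix steps above can be illustrated with a minimal sketch. It is not taken from Besling; the function names `relative_frequencies` and `frequency_matrix`, and the choice to normalize each count by the total count of its grapheme, are assumptions made only for illustration.

```python
from collections import Counter


def relative_frequencies(aligned_pairs):
    """Relative frequency of each phoneme given the grapheme it was assigned to.

    aligned_pairs is a list of (grapheme, phoneme) tuples taken from
    existing assignments. Normalizing per grapheme is an assumption.
    """
    pair_counts = Counter(aligned_pairs)
    grapheme_counts = Counter(g for g, _ in aligned_pairs)
    return {
        (g, p): count / grapheme_counts[g]
        for (g, p), count in pair_counts.items()
    }


def frequency_matrix(graphemes, phonemes, freq):
    """Two dimensional matrix: rows indexed by grapheme, columns by phoneme.

    Pairs never observed in the assignments receive 0.0 (an assumption).
    """
    return [[freq.get((g, p), 0.0) for p in phonemes] for g in graphemes]


pairs = [("c", "k"), ("a", "ae"), ("t", "t"), ("c", "s")]
freq = relative_frequencies(pairs)
matrix = frequency_matrix("cat", ["k", "ae", "t"], freq)
```

In this toy example the grapheme "c" was assigned to "k" once and to "s" once, so the entry for ("c", "k") is 0.5.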

Additionally, Besling discloses that the two dimensional matrix is used to align 
graphemes to phonemes by dynamic time warping (dynamic programming, page 368, 
line 25 - page 369, line 2). 

Furthermore, Besling discloses that the matrix elements along the path define the 
assignment of graphemes to phonemes of the word. 

Besling is silent as to the details of the dynamic time warping (dynamic 
programming) algorithm used to align the phonemes to the graphemes. 

Sakoe et al. discloses a dynamic time warping (dynamic programming) method. 
The method includes the following steps: 

A two dimensional matrix is given (in which two patterns A and B are developed 
along the i and j axes, respectively; herein the i axis will correspond with graphemes and 
the j axis will correspond with phonemes), in which the distance (d(i,j), corresponding to 
the relative frequencies, as mentioned above) between the two patterns is used as 
entries of the matrix (Fig. 1). 

Each matrix entry is logically combined (added) with the extreme value 
(minimum) of either: 

a) the entry for the same phoneme and the preceding grapheme in the word 
(Table 1, P=0, Symmetric case, g(i,j-1) + d(i,j)). 

b) the entry for the preceding phoneme and the same grapheme in the word 
(g(i-1,j) + d(i,j)). 

c) and the entry for the preceding phoneme and the preceding grapheme in 
the word (g(i-1,j-1) + 2d(i,j)). 

These entries are logically combined using the first phoneme of the word as the 
starting point in the mathematical operation and using the modified entries yielded from 
the mathematical operation, to determine which of the three preceding matrix entries 
was extreme to determine a step direction for that matrix entry (Fig. 4, sections III-A and 
III-B). 

The step direction determined for the matrix entry is defined, starting from the 
matrix entry for the last phoneme and last grapheme, and proceeding along a path 
through the matrix up to the matrix entry for the first phoneme and the first grapheme 
(Fig. 1). 
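
The recurrence and the path tracing described above can be sketched as follows. This is an illustrative sketch only, assuming the symmetric P=0 form quoted above, 0-based indices, and the initial value g(1,1) = 2d(1,1); the name `dtw_align` and the step encoding are hypothetical and do not come from either reference.

```python
def dtw_align(d):
    """Accumulate distances with the symmetric P=0 recurrence and trace the path.

    d[i][j] is the local distance between element i of pattern A and
    element j of pattern B. Returns (accumulated distance, path), where
    path is a list of (i, j) cells from the first cell to the last cell.
    """
    n, m = len(d), len(d[0])
    inf = float("inf")
    g = [[inf] * m for _ in range(n)]
    step = [[None] * m for _ in range(n)]
    g[0][0] = 2 * d[0][0]  # assumed initial condition
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            candidates = []
            if j > 0:
                candidates.append((g[i][j - 1] + d[i][j], (0, 1)))
            if i > 0:
                candidates.append((g[i - 1][j] + d[i][j], (1, 0)))
            if i > 0 and j > 0:
                candidates.append((g[i - 1][j - 1] + 2 * d[i][j], (1, 1)))
            # Keep the minimum and remember which predecessor produced it.
            g[i][j], step[i][j] = min(candidates, key=lambda c: c[0])
    # Follow the stored step directions back from the last cell.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        di, dj = step[i][j]
        i, j = i - di, j - dj
        path.append((i, j))
    path.reverse()
    return g[n - 1][m - 1], path


cost, path = dtw_align([[0, 5], [5, 0], [5, 0]])
```

In the small example the second and third rows both map onto the second column, so the path takes one diagonal step followed by one step along the i axis.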

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify the method of aligning graphemes to phonemes by using a dynamic 
time warping (dynamic programming) method, as disclosed by Besling, by using the 
specific algorithm of dynamic time warping (dynamic programming), as disclosed by 
Sakoe et al., with patterns A and B being graphemes and phonemes, respectively, 
because the algorithm is optimal and superior to several other dynamic time warping 
(dynamic programming) algorithms, as taught by Sakoe et al. (section VI). 

9. In regard to claim 3, Besling does not disclose the relative frequencies in the first 
step are determined by selecting words from the lexicon in the case of which the 
number of the graphemes and the number of the phonemes coincide, for the selected 
words, the graphemes and phonemes are assigned to one another in the sequence of 
the specification of their graphemes and phonemes in the lexicon. 

The examiner takes official notice that it is well known and recognized in the art 
that there is no need to implement dynamic time warping when two patterns are already 
aligned (such as when the number of graphemes and phonemes is the same). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Besling so that if the number of graphemes and phonemes for the 
selected words coincided, the graphemes and phonemes would be assigned to each 
other in the sequence of specification in the lexicon, thereby reducing processing time 
because the dynamic time warping method would be implemented fewer times. 
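
The modification discussed above can be sketched as follows, assuming a lexicon represented as (spelling, phoneme list) entries. The function name `bootstrap_pairs` and the lexicon layout are hypothetical and serve only to illustrate assigning graphemes and phonemes in sequence when their numbers coincide.

```python
def bootstrap_pairs(lexicon):
    """Collect in-sequence grapheme-phoneme assignments without any warping.

    Only words whose grapheme count equals their phoneme count are used;
    for those words the k-th grapheme is paired with the k-th phoneme.
    """
    pairs = []
    for spelling, phonemes in lexicon:
        if len(spelling) == len(phonemes):
            pairs.extend(zip(spelling, phonemes))
    return pairs


lexicon = [
    ("cat", ["k", "ae", "t"]),    # 3 graphemes, 3 phonemes: used
    ("phone", ["f", "ow", "n"]),  # 5 graphemes, 3 phonemes: skipped
]
pairs = bootstrap_pairs(lexicon)
```

The resulting pairs could then serve as the input from which initial relative frequencies are counted.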

10. In regard to claim 6, Besling discloses that after the execution of the assignment 
of graphemes to phonemes for each word of the lexicon, the assignments are used to 
determine the position-dependent (probability of a grapheme g at a position i, g_i) relative 
frequency with which the following combinations occur: 

a) a phoneme produced by two or more graphemes (phoneme stretching) 

b) two or more phonemes produced by a grapheme (grapheme stretching) 



Application/Control Number: 09/943,091 Page 8 

Art Unit: 2655 

c) two or more graphemes assigned to a phoneme (phoneme stretching), 
and 

d) a grapheme assigned to two or more phonemes (grapheme stretching). 
See page 369, lines 14-18, Fig. 2, and section 3. 

11. In regard to claim 7, Besling discloses the assignment of graphemes to 
phonemes is corrected with the aid of the position dependent frequencies (by 
performing several iterations of alignment and re-estimation, page 369, lines 4-6). 

12. In regard to claim 8, Besling discloses after assigning graphemes to phonemes 
for selected words in the sequence of the specification, the corrected assignments are 
used to recalculate the relative frequency with which a phoneme is produced by two or 
more graphemes, or two or more phonemes are produced by a grapheme. Hypotheses 
for a phonetic transcription are evaluated using a matching model (that generates 
corrected assignments) that calculates the position-dependent relative frequency with 
which a phoneme is produced by two or more graphemes or two or more phonemes 
that are produced by a grapheme. All new hypotheses are recalculated for each 
possible phoneme string (Fig. 1 and page 372, lines 3-15). 

The recalculated position dependent relative frequencies are used to again 
assign graphemes to phonemes for selected words in the sequence of the specification 
(hypotheses that have been generated by the position dependent relative frequencies 
are recursively investigated, page 372, lines 13-15). 

13. In regard to claim 9, Besling discloses in determining the relative frequencies, 
only those assignments are taken into account for which the matrix entry for the last 
phoneme and the last grapheme exceeds a prescribed threshold value (page 372, lines 
10-11). 

14. In regard to claim 10, Besling discloses that the matrix entry for the first phoneme 
and first grapheme of each word (word start) and the matrix entry for the last phoneme 
and last grapheme of a word (word end) are marked to capture the special behavior at 
those points (page 371, second paragraph, lines 3-4). 

Besling does not disclose that both those matrix entries are set to 1. 
Furthermore, Besling does not disclose that the matrix entry for the first phoneme and 
the last grapheme of each word is set to 0, or that the matrix entry for the last phoneme 
and the first grapheme of each word is set to 0. 

Sakoe et al. discloses that a slope constraint (P, equation 9) is used to prevent 
the unrealistic alignment of two patterns (such as the alignment of a first phoneme with 
a last grapheme, or a last phoneme with a first grapheme, section II-B, slope constraint 
condition 5, pages 44-45). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Besling so that the matrix entry for the first phoneme and first grapheme of 
each word (word start) and the matrix entry for the last phoneme and last grapheme of a 
word (word end) were set to 1 (indicating a 100% probability that the first phoneme will 
align with a first grapheme and a last phoneme will align with a last grapheme), to 
ensure the proper alignment of the graphemes to the phonemes. It also would have 
been obvious to one of ordinary skill in the art at the time of invention to further modify 
Besling so that the matrix entry for the first phoneme and the last grapheme of each 
word was set to 0, or that the matrix entry for the last phoneme and the first grapheme 
of each word was set to 0, thereby implementing a slope constraint, as taught by Sakoe 
et al., in order to prevent the unrealistic alignment of two patterns, as taught by Sakoe et 
al. 

15. In regard to claim 11, Besling discloses most transcription errors are caused by 
one or two phoneme errors in a given word (Table III and page 375, lines 14-19). 

Besling does not disclose that if, in the determination of the maximum value of the 
three preceding matrix entries, the matrix entry for the preceding phoneme and the 
preceding grapheme in the word and one of the other two entries are of equal 
magnitude, the matrix entry for the preceding phoneme and the preceding grapheme in 
the word is regarded as maximum. 

Sakoe et al. discloses the determination of maximum value of the three 
preceding matrix entries (Table III, Velichko and Zagoruyko algorithm). 

Sakoe et al. does not explicitly disclose that if the entry of the preceding 
phoneme and preceding grapheme in the word and one of the other two entries are of 
equal magnitude, the matrix entry for the preceding phoneme and grapheme is 
regarded as maximum (no definition is given for the case when two of the entries are 
equal). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Besling to determine the maximum value of the three preceding 
matrix entries, as disclosed by Sakoe et al., so that if the entry of the preceding 
phoneme and preceding grapheme in the word and one of the two other entries were of 
equal magnitude, the matrix entry for the preceding phoneme and grapheme was 
regarded as maximum, in order to reduce the chances of a phoneme being assigned to 
an incorrect grapheme. 
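
The tie rule discussed above can be illustrated with a short sketch. The function name `best_predecessor` and the labels "diagonal", "left", and "up" are hypothetical; "diagonal" stands for the entry for the preceding phoneme and the preceding grapheme, and the order chosen between the other two entries is an assumption.

```python
def best_predecessor(diagonal, left, up):
    """Pick the maximum of three preceding matrix entries.

    When the diagonal entry ties with another entry for the maximum,
    the diagonal entry is regarded as the maximum.
    """
    best = max(diagonal, left, up)
    if diagonal == best:
        return "diagonal"
    # Preference between the remaining two entries is an assumption.
    return "left" if left == best else "up"


choice = best_predecessor(0.5, 0.5, 0.2)
```

Here the diagonal entry and the left entry are both 0.5, and the diagonal entry is selected.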



Conclusion 

16. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. Glickman et al. (U.S. Patent 6,076,059) discloses a method of 
aligning text with audio signals. Shaw et al. (U.S. Patent 6,363,342) discloses a system 
for developing word pronunciation pairs that has a dynamic programming phoneme 
sequence generator. Molnar et al. (U.S. Patent 6,411,932) discloses a method of 
learning word pronunciations from training corpora that uses a dynamic aligner. Kim et 
al. (U.S. Patent 6,236,965) discloses a method for creating a pronunciation dictionary 
that uses a dynamic time warping algorithm to align graphemes and phonemes. 

17. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Brian L Albertalli whose telephone number is (703) 305-1817. 
The examiner can normally be reached on Monday - Friday, 8:30 AM - 5:00 PM. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Talivaldis Smits, can be reached on (703) 305-3011. The fax phone number 
for the organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 